A new algorithm for comparing and visualizing relationships between hierarchical and flat gene expression data clusterings

Authors

  • Aurora Torrente
  • Misha Kapushesky
  • Alvis Brazma
Abstract

MOTIVATION: Clustering is one of the most widely used methods in unsupervised gene expression data analysis. The use of different clustering algorithms or different parameters often produces rather different results on the same data. Biological interpretation of multiple clustering results requires understanding how different clusters relate to each other. It is particularly non-trivial to compare the results of a hierarchical and a flat, e.g. k-means, clustering.

RESULTS: We present a new method for comparing and visualizing relationships between different clustering results, either flat versus flat, or flat versus hierarchical. When comparing a flat clustering to a hierarchical clustering, the algorithm cuts different branches in the hierarchical tree at different levels to optimize the correspondence between the clusters. The optimization function is based on graph layout aesthetics or on mutual information. The clusters are displayed using a bipartite graph where the edges are weighted proportionally to the number of common elements in the respective clusters and the weighted number of crossings is minimized. The performance of the algorithm is tested using simulated and real gene expression data. The algorithm is implemented in the online gene expression data analysis tool Expression Profiler.

AVAILABILITY: http://www.ebi.ac.uk/expressionprofiler
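The quantities named in the abstract (edge weights counting common elements, mutual information, and a weighted crossing count) can be illustrated for the flat versus flat case. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and weighting each crossing by the product of the two edge weights is an assumption made here for illustration. The branch-cutting step for hierarchical trees is not covered.

```python
from collections import Counter
from math import log


def edge_weights(labels_a, labels_b):
    """Count shared elements for each (cluster in A, cluster in B) pair.

    These counts play the role of edge weights in the bipartite graph.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same elements")
    return Counter(zip(labels_a, labels_b))


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two flat clusterings."""
    n = len(labels_a)
    joint = edge_weights(labels_a, labels_b)
    size_a = Counter(labels_a)
    size_b = Counter(labels_b)
    return sum(
        (w / n) * log(w * n / (size_a[a] * size_b[b]))
        for (a, b), w in joint.items()
    )


def weighted_crossings(weights, order_a, order_b):
    """Sum over crossing edge pairs for a given node ordering per layer.

    Assumption: a crossing contributes the product of its two edge weights.
    """
    pos_a = {c: i for i, c in enumerate(order_a)}
    pos_b = {c: i for i, c in enumerate(order_b)}
    edges = [(pos_a[a], pos_b[b], w) for (a, b), w in weights.items()]
    total = 0
    for i, (a1, b1, w1) in enumerate(edges):
        for a2, b2, w2 in edges[i + 1:]:
            # Two edges cross when their endpoints are ordered oppositely
            # in the two layers.
            if (a1 - a2) * (b1 - b2) < 0:
                total += w1 * w2
    return total


if __name__ == "__main__":
    a = [0, 0, 1, 1]              # e.g. a k-means result
    b = ["x", "x", "y", "y"]      # a second flat clustering
    w = edge_weights(a, b)
    print(dict(w))
    print(mutual_information(a, b))
    print(weighted_crossings(w, [0, 1], ["y", "x"]))
```

Reordering the nodes in one layer changes only the crossing count, not the edge weights or the mutual information, which is why a layout criterion and an information criterion can both serve as objectives.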

Similar resources

The clustComp package

This document presents an overview of the clustComp package, which is a collection of tools developed for the comparison and the visualisation of relationships between two clusterings, either flat versus flat or flat versus hierarchical. Both situations are addressed by representing clusters as nodes in a weighted bipartite graph, where each layer corresponds to one of the clusterings under com...


Presenting a new equation for estimation of daily coefficient of evaporation pan using Gene Expression Programming and comparing it with experimental methods (Case Study: Birjand Plain)

One of the most important components of water management in farms is estimating the exact amount of crop evapotranspiration (water need). The FAO-Penman-Monteith (FPM) method is a standard method to evaluate other techniques which are used for easy calculation of potential evapotranspiration, when lysimeter datasheets are not available. This study was carried out based on 18 years’ climatic dat...


On comparing clusterings: an element-centric framework unifies overlaps and hierarchy

Clustering is one of the most universal approaches for understanding complex data. A pivotal aspect of clustering analysis is quantitatively comparing clusterings; clustering comparison is the basis for tasks such as clustering evaluation, consensus clustering, and tracking the temporal evolution of clusters. For example, the extrinsic evaluation of clustering methods requires comparing the unc...


QC4 - A Clustering Evaluation Method

Many clustering algorithms have been developed and researchers need to be able to compare their effectiveness. For some clustering problems, like web page clustering, different algorithms produce clusterings with different characteristics: coarse vs fine granularity, disjoint vs overlapping, flat vs hierarchical. The lack of a clustering evaluation method that can evaluate clusterings with diff...


Graph-Based Hierarchical Conceptual Clustering

Hierarchical conceptual clustering has proven to be a useful, although under-explored, data mining technique. A graph-based representation of structural information combined with a substructure discovery technique has been shown to be successful in knowledge discovery. The SUBDUE substructure discovery system provides one such combination of approaches. This work presents SUBDUE and the develop...



Journal:
  • Bioinformatics

Volume 21, Issue 21

Pages: -

Publication date: 2005